Generating properly weighted ensemble of conformations of proteins from sparse or indirect distance constraints.
Authors
Abstract
Inferring three-dimensional structural information about biomacromolecules such as proteins from limited experimental data is an important and challenging task. Nuclear Overhauser effect measurements based on nuclear magnetic resonance, disulfide linking, and electron paramagnetic resonance labeling studies can all provide useful partial distance constraints characteristic of the conformations of proteins. In this study, we describe a general approach for reconstructing conformations of biomolecules that are consistent with given distance constraints. Such constraints can take the form of upper and lower bounds on distances between residue pairs, contact maps based on specific contact distance cutoff values, or indirect distance constraints such as experimental phi-value measurements. Our approach is based on the framework of the sequential Monte Carlo method, a chain growth-based method. We have developed a novel growth potential function to guide the generation of conformations that satisfy the given distance constraints. This potential function incorporates not only the distance information of the current residue during growth but also the distance information of future residues, by introducing global distance upper bounds between residue pairs and the placement of reference points. To obtain protein conformations from indirect distance constraints in the form of experimental phi-values, we first generate properly weighted contact maps satisfying the phi-value constraints and then generate conformations from these contact maps. We show that our approach can faithfully generate conformations that satisfy the given constraints, and that these conformations approach the native structures when distance constraints for all residue pairs are given.
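The chain-growth idea in the abstract can be illustrated with a toy example. The sketch below is not the growth potential or protein model of the paper: it assumes a self-avoiding chain on a 2D square lattice with unit bonds, distance upper bounds only, and a simple look-ahead rule based on the triangle inequality (a future residue j can be at most j - k bonds away from the residue k being placed). All names (`trial_score`, `grow_chain`, `weighted_mean`, `bias`) are hypothetical. Each chain carries an importance weight equal to the product of the inverse trial probabilities, so that weighted averages target the uniform distribution over constraint-satisfying chains.

```python
import math
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # unit bonds on a 2D square lattice


def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def trial_score(k, site, chain, upper_bounds, bias):
    """Score for placing residue k at `site`; 0.0 means the site is pruned."""
    score = 1.0
    for (i, j), d in upper_bounds.items():  # keys are pairs (i, j) with i < j
        if i >= k:
            continue  # residue i is not placed yet
        r = dist(site, chain[i])
        if j == k and r > d:
            return 0.0  # constraint (i, k) is violated outright
        if j > k:
            if r > d + (j - k):
                return 0.0  # residue j could no longer come within d of residue i
            score *= math.exp(-bias * max(0.0, r - d))  # soft pull toward residue i
    return score


def grow_chain(n, upper_bounds, rng, bias=0.5):
    """Grow one n-residue chain; return (chain, weight) or (None, 0.0) on a dead end."""
    chain, occupied, weight = [(0, 0)], {(0, 0)}, 1.0
    for k in range(1, n):
        x, y = chain[-1]
        sites = [(x + dx, y + dy) for dx, dy in MOVES]
        scored = [(s, trial_score(k, s, chain, upper_bounds, bias))
                  for s in sites if s not in occupied]
        scored = [(s, sc) for s, sc in scored if sc > 0.0]
        if not scored:
            return None, 0.0
        total = sum(sc for _, sc in scored)
        site, sc = rng.choices(scored, weights=[sc for _, sc in scored])[0]
        weight *= total / sc  # inverse of the trial probability sc / total
        chain.append(site)
        occupied.add(site)
    return chain, weight


def weighted_mean(samples, f):
    """Importance-weighted average of f over (chain, weight) samples."""
    return sum(w * f(c) for c, w in samples) / sum(w for _, w in samples)


if __name__ == "__main__":
    rng = random.Random(0)
    bounds = {(0, 7): 1.0}  # residues 0 and 7 must end up in contact
    samples = [s for s in (grow_chain(8, bounds, rng) for _ in range(300)) if s[0]]
    print(len(samples), weighted_mean(samples, lambda c: dist(c[0], c[4])))
```

Chains that hit a dead end are discarded with weight zero. The `bias` term only changes the trial distribution; because each weight divides it back out, the weighted ensemble still targets the same distribution, which is the sense in which the samples are properly weighted.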
Similar resources
Investigating energy-based pool structure selection in the structure ensemble modeling with experimental distance constraints: The example from a multidomain protein Pub1.
The structural variations of multidomain proteins with flexible parts mediate many biological processes, and a structure ensemble can be determined by selecting a weighted combination of representative structures from a simulated structure pool, producing the best fit to experimental constraints such as interatomic distance. In this study, a hybrid structure-based and physics-based atomistic fo...
Weighted Ensemble Clustering for Increasing the Accuracy of the Final Clustering
Clustering algorithms are highly dependent on different factors such as the number of clusters, the specific clustering algorithm, and the used distance measure. Inspired from ensemble classification, one approach to reduce the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in th...
Constrained Proper Sampling of Conformations of Transition State Ensemble during Protein Folding
Characterizing the conformations of proteins in the transition state ensemble (TSE) is important for studying protein folding. A promising approach pioneered by Vendruscolo et al. [40] to study the TSE is to generate conformations that satisfy all constraints imposed by the experimentally measured φ-values that provide information about the native-likeness of the transition states. Faisca et al. [12] genera...
Representing protein native states using weighted conformation ensembles
The important structural and functional roles played by proteins in the proper functioning of cellular processes cannot be overstated. To comprehensively understand their functional behaviors, structural models derived from experimental data have been developed and these models have played a significant role in explaining the functional mechanisms of proteins. The paradigm “structure drives fun...
A critical analysis of computational protein design with sparse residue interaction graphs
Protein design algorithms enumerate a combinatorial number of candidate structures to compute the Global Minimum Energy Conformation (GMEC). To efficiently find the GMEC, protein design algorithms must methodically reduce the conformational search space. By applying distance and energy cutoffs, the protein system to be designed can thus be represented using a sparse residue interaction graph, w...
Journal title:
- The Journal of Chemical Physics
Volume 129, Issue 9
Pages -
Publication date: 2008